ABSTRACT

Summary: High-throughput technologies have led to an explosion of 
genomic data available for automated analysis. The consequent pos- 
sibility to simultaneously sample multiple layers of variation along the 
gene expression flow requires computational methods integrating raw 
information from different '-omics'. It has been recently demonstrated 
that translational control is a widespread phenomenon, with profound 
and still underestimated regulation capabilities. Although detecting 
changes in the levels of total messenger RNAs (mRNAs; the transcrip- 
tome), of polysomally loaded mRNAs (the translatome) and of proteins 
(the proteome) is experimentally feasible in a high-throughput way, the 
integration of these levels is still far from being robustly approached. 
Here we introduce tRanslatome, a new R/Bioconductor package, 
which is a complete platform for the simultaneous pairwise analysis 
of transcriptome, translatome and proteome data. The package in- 
cludes most of the available statistical methods developed for the 
analysis of high-throughput data, allowing the parallel comparison of 
differentially expressed genes and the corresponding differentially en- 
riched biological themes. Notably, it also enables the prediction of 
translational regulatory elements on mRNA sequences. The utility of 
this tool is demonstrated with two case studies. 
Availability and implementation: tRanslatome is available in 
Bioconductor. 
Contact: t.tebaldi@unitn.it 

Supplementary information: Supplementary data are available at 
Bioinformatics online. 

Received on July 2, 2013; revised on October 10, 2013; accepted on 
October 29, 2013 

1 INTRODUCTION 

High-throughput ('-omics') measurements of macromolecule 
variations in the cell offer the possibility to comprehensively 
understand how the cellular processes are regulated and to 
reveal how different layers of control are coordinated in produ- 
cing a physiologically coherent response. These measurements 
are also invaluable for understanding how the loss of this coordination
contributes to disease origin. The establishment of high-throughput
technologies and the consequent explosion of available data allow us to
reach a 'systems' understanding of the variations in gene expression
only when a parallel evolution of algorithms and data mining techniques
is achieved. This eventually makes it possible to suggest and prioritize
potential mechanistic
processes. Nonetheless, the integration of '-omics' data, ranging
from epigenetic chromatin remodeling to the dynamics of tran- 
scription, translation and protein activities, still requires consid- 
erable experimental and computational developments. 

In this context, the low correlation observed between messen- 
ger RNA (mRNA) and protein levels is an unsolved issue (Vogel 
and Marcotte, 2012). Recently we showed that the analysis of the 
translatome, an intermediate level between the transcriptome and 
the proteome formed by mRNAs engaged with polysomes, pro- 
vides substantial and somewhat surprising new information 
(Tebaldi et al., 2012). This and other examples (Colman et al.,
2013; Schwanhäusser et al., 2011; Vogel et al., 2010) show how
the integration of '-omics' data can provide a biologically rele- 
vant outcome. 

Here we present tRanslatome, a new Bioconductor package 
for the analysis of differential profiles coming from transcrip- 
tome, translatome and proteome studies. tRanslatome will help 
to study mRNA and protein variations in an exhaustive way, 
providing specific tools for the comparison of polysomal mRNA 
with total mRNA or protein data. 

2 DESCRIPTION AND USAGE 

tRanslatome is a complete platform for the analysis and pairwise 
comparison of two '-omics' levels implemented as a 
Bioconductor package. It is developed to compare translatome 
data with transcriptome and/or proteome data. A general 
overview of the functions offered by tRanslatome is given in 
Figure 1A. The package is conceptually organized in three modules,
described as follows:

2.1 DEGs detection 

The only input required by tRanslatome is an expression matrix
containing either read counts (from next-generation sequencing
data) or normalized signals (from microarray or proteome
experiments). To select differentially expressed genes (DEGs),
the package integrates, in the same computational environment,
established and emerging statistical methods: (i) DESeq (Anders
and Huber, 2010) and edgeR (Robinson et al., 2010), specifically
implemented for the analysis of next-generation sequencing data;
(ii) significance analysis of microarrays (SAM) (Tusher et al.,
2001), developed for the analysis of microarray data; (iii) t-test,
RankProd (Breitling et al., 2004), linear models and moderated
t-test (Smyth, 2004), suitable for general quantitative data; and
(iv) methods dealing specifically with the comparison of
translatome and transcriptome data, e.g. ANOTA (Larsson et al.,
2011) and translational efficiency, derived from the ratio of
polysomal to subpolysomal signals (Powley et al., 2009) or the
ratio of ribosome-protected fragments to RNA-seq reads (Ingolia
et al., 2011). These techniques are described more exhaustively in
the documentation and in the Supplementary Material.

Fig. 1. Outline of tRanslatome workflow and graphical outputs. (A) General overview of the three modules provided by tRanslatome. (B) Scatterplot of
fold changes. Each gene is mapped according to the fold change in the transcriptome and the translatome. Different classes of DEGs are color labeled.
The Spearman correlation coefficients are displayed for all genes and for all DEGs. (C) Radar plot of the top enriched GO biological process terms for
the transcriptome and the translatome DEGs. (D) Heatmap of the top enriched post-transcriptional regulators for the transcriptome and the translatome
DEGs. The color scale is based on the -log10 of the enrichment P-value, calculated with a Fisher test.
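
The translational efficiency mentioned in section 2.1 can be illustrated with a minimal Python sketch. It is not the tRanslatome implementation: the function names, the pseudocount and the example signals are assumptions made only for illustration. It computes the log2 ratio of polysomal to subpolysomal signal for one gene and the change of that ratio between two conditions.

```python
import math


def translational_efficiency(polysomal, subpolysomal, pseudocount=1.0):
    """Return log2(polysomal / subpolysomal) for one gene in one condition.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no subpolysomal signal.
    """
    return math.log2((polysomal + pseudocount) / (subpolysomal + pseudocount))


def delta_te(control, treated, pseudocount=1.0):
    """Change in log2 translational efficiency between two conditions.

    Each argument is a (polysomal, subpolysomal) pair of signals.
    """
    te_control = translational_efficiency(*control, pseudocount=pseudocount)
    te_treated = translational_efficiency(*treated, pseudocount=pseudocount)
    return te_treated - te_control


# Hypothetical signals for a single gene: (polysomal, subpolysomal)
control = (99.0, 99.0)
treated = (399.0, 99.0)
change = delta_te(control, treated)
print(f"log2 TE change: {change:.2f}")
```

A positive value indicates that, relative to the control, a larger share of the gene's mRNA is loaded on polysomes in the treated condition. The same ratio could be formed from ribosome-protected fragments and RNA-seq reads.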

To study all the relevant differences arising from the two
'-omics' levels, tRanslatome offers a variety of graphical outputs
that support quality assessment and interpretation of the results.
The minus-average plots aid the identification of intensity-dependent
patterns, whereas the standard deviation plot can help
the selection of the best method for DEG identification (see
Supplementary Figs 4 and 5). The graphics also include scatterplots,
displaying changes in the expression of genes in terms of
fold changes at both levels (Fig. 1B), and histograms, showing a
detailed representation of all the DEG classes (Supplementary
Figs 2 and 3).
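
The logic behind the fold-change scatterplot can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the package's code: the class labels, the fold-change threshold and the example values are hypothetical, and genes are labeled by fold change alone, whereas the methods listed above also rely on statistical significance.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def deg_class(fc_first, fc_second, threshold=1.0):
    """Label a gene by the level(s) at which |log2 fold change| >= threshold."""
    changed_first = abs(fc_first) >= threshold
    changed_second = abs(fc_second) >= threshold
    if changed_first and changed_second:
        same = fc_first * fc_second > 0
        return "both, same direction" if same else "both, opposite direction"
    if changed_first:
        return "first level only"
    if changed_second:
        return "second level only"
    return "not changed"


# Hypothetical log2 fold changes for five genes at the two levels
transcriptome = [2.0, 0.1, -1.5, 0.2, 1.2]
translatome = [1.8, 1.4, 1.1, -0.1, 0.3]

classes = [deg_class(a, b) for a, b in zip(transcriptome, translatome)]
rho = spearman(transcriptome, translatome)
for label in classes:
    print(label)
print(f"Spearman rho over all genes: {rho:.2f}")
```

Restricting the two lists to genes whose label is not "not changed" before calling `spearman` would give the corresponding coefficient over DEGs only.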

2.2 Gene Ontology enrichment comparison 

One of the most frequent applications of the Gene Ontology
(GO) is enrichment analysis, i.e. the identification of significantly
overrepresented GO terms in a given gene set (Ashburner et al.,
2000). tRanslatome includes the detection and comparison of
GO terms, summarizing information about cellular components,
molecular functions and biological processes associated with
DEGs detected from the two '-omics' levels. Multiple choices
are offered for the overrepresentation test, which exploits the
GO 'tree' structure by means of the Bioconductor package
'topGO': these choices can satisfy the need for either more general
or more specific biological themes. To simplify the inspection
of the results and to effectively represent the differences in
the enrichment of ontological terms, the corresponding radar
plots and heatmaps can be produced (Fig. 1C and Supplementary
Figs 7 and 8). tRanslatome also provides methods for a
sensitive comparison of the similarity between enriched GO
terms, including the semantic similarity scores (Wang et al.,
2007) between terms at each level, and the global similarity
score between the two levels.

2.3 Enrichment analysis of post-transcriptional regulatory elements

As tRanslatome focuses on the study of global translational controls,
enrichment analysis of binding sites for RNA-binding proteins and
microRNAs, or of other RNA regulatory motifs (e.g.
AU-rich elements), can be performed on the lists of DEGs. This
analysis allows the user to identify possible regulatory factors
responsible for the translational regulation of genes in the experiment
under consideration (Fig. 1D). The list of genes regulated
by each post-transcriptional element is obtained from the
recently established Atlas of UTR Regulatory Activity (AURA)
database (Dassi et al., 2012). The method computes a Fisher test
P-value indicating whether binding sites for each regulator are
significantly enriched in the DEG lists. The annotations from
AURA will be updated on every release of tRanslatome. Users
can also specify a custom annotation file in place of the one
provided by default.
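
The enrichment test described above can be sketched as follows. This is a minimal Python illustration, not the package's implementation. It assumes a one-sided Fisher exact test (the source does not state the sidedness), computed as the upper tail of the hypergeometric distribution; the function name and all counts are hypothetical.

```python
from math import comb, log10


def enrichment_p_value(targets_in_degs, degs, targets, background):
    """One-sided Fisher exact test P-value (hypergeometric upper tail).

    Probability of observing at least `targets_in_degs` regulator targets
    among `degs` genes drawn from `background` genes, of which `targets`
    are annotated as regulated by the element.
    """
    total = comb(background, degs)
    upper = min(degs, targets)
    tail = sum(
        comb(targets, k) * comb(background - targets, degs - k)
        for k in range(targets_in_degs, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 of them targets of one
# regulator, 5 DEGs, 3 of which are targets.
p_value = enrichment_p_value(targets_in_degs=3, degs=5, targets=5, background=20)
score = -log10(p_value)  # the quantity used for the heatmap color scale
print(f"P = {p_value:.4f}, -log10(P) = {score:.2f}")
```

Running this once per regulator and per DEG list yields the matrix of -log10 P-values displayed in a heatmap such as Figure 1D. In practice a multiple-testing correction would normally be applied across regulators.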

3 EXAMPLES 

A worked example, derived from data on differentiated versus
undifferentiated human hepatocytes (Parent et al., 2007), is used
to generate the panels contained in Figure 1. Detailed explanations
of this example, along with a second example dealing with
the comparison of the proteome and the transcriptome between
two human cell lines (Stevens and Brown, 2013), are provided in
the Supplementary Material.

4 CONCLUSION 

tRanslatome allows a user-friendly comparison and integration 
of data generated from two '-omics' measurements, empowering 
the discovery of regulatory mechanisms underlying the uncou- 
pling processes among the transcriptome, the translatome and 
the proteome. 